RELIABILITY ENGINEERING — A LIFE CYCLE APPROACH 
INSTRUCTOR’S MANUAL 
CHAPTER 1 
The Monty Hall Problem 


Changing one’s choice increases the probability of winning. The
easiest way to see this from a probability point of view is to note that originally there is a
probability of 1/3 behind every door. So there is a probability of 1/3 behind the door originally chosen,
and a combined probability of 2/3 behind the remaining two doors. Once one of those two doors
is opened, there remains a probability of 1/3 behind the door originally chosen, and the other
unopened door now carries the probability 2/3. Hence changing one’s choice of door increases
the probability of winning the car.


This does not mean that the car is not behind the door originally chosen, only that if one were
to repeat the exercise say 100 times, then the car would be behind the first door chosen about
33 times and behind the alternative choice about 67 times. Prove it for yourself using Excel!
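
The same experiment can be run in a few lines of code instead of a spreadsheet. The sketch below is a minimal simulation, assuming the host always opens a door that hides no car and chooses at random when two such doors are available; the function names are illustrative only.

```python
import random

def play(switch, rng):
    """Play one game; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the single door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=1):
    """Fraction of games won over many repetitions."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials

stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay:   {stay_rate:.3f}")
print(f"switch: {switch_rate:.3f}")
```

With enough repetitions the two rates settle close to 1/3 and 2/3 respectively.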


Another way to prove this result is to use Bayes’ Theorem, which readers can look up for
themselves on the internet.
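
As a sketch of that argument, suppose (the labels here are assumptions, not taken from the book) that the contestant picks door 1 and the host, who knows where the car is and always reveals an empty door, opens door 3. Write C_i for "the car is behind door i" and H_3 for "the host opens door 3", and assume the host chooses at random when both unpicked doors are empty. Each P(C_i) is 1/3, so:

```latex
\begin{align*}
P(H_3 \mid C_1) &= \tfrac{1}{2}, \qquad P(H_3 \mid C_2) = 1, \qquad P(H_3 \mid C_3) = 0, \\[4pt]
P(C_2 \mid H_3) &= \frac{P(H_3 \mid C_2)\,P(C_2)}{\sum_{i=1}^{3} P(H_3 \mid C_i)\,P(C_i)}
  = \frac{1 \cdot \tfrac{1}{3}}{\tfrac{1}{2}\cdot\tfrac{1}{3} + 1\cdot\tfrac{1}{3} + 0\cdot\tfrac{1}{3}}
  = \frac{2}{3}, \\[4pt]
P(C_1 \mid H_3) &= 1 - P(C_2 \mid H_3) = \frac{1}{3}.
\end{align*}
```

Switching to door 2 therefore wins with probability 2/3, in agreement with the counting argument.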


Assignment 1.2: Failure Free Operating Period 


The FFOP (Failure Free Operating Period) is the time for which the device will run without
failure and therefore without the need for maintenance. It is the Gamma value (the location
parameter) of the Weibull distribution. From the list of failure times 150, 190, 220, 275, 300,
350, 425, 475 hours, the Offset is calculated as 97.42 hours, say 100 hours. This is the time
during which there should be no probability of failure. It will be seen from the graph in the
software with Beta = 2 that the distribution is of almost perfect normal shape and that the
distribution does not begin at the origin. The gap is the 100 hours that the software calculates
when asked.


When the graph is studied for Beta = 2 it will be seen that there is a downward trajectory in the
three left-hand points. If this trajectory is extended down to the horizontal axis it is seen to
intersect it at about 120 hours. This is the graphical estimate of Gamma. In the days before
software this was always the most unreliable estimate of a Weibull parameter and the most
difficult to obtain graphically.
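
One way to mimic the offset calculation numerically is to try a range of Gamma values and keep the one that makes the Weibull plot straightest. The sketch below does this with a simple grid search over non-negative offsets. How WEIBULL-DR actually computes the offset is not stated here, so the use of Bernard's median rank approximation, the one-hour step and the function names are all assumptions, and the result need not reproduce 97.42 hours.

```python
import math

def median_ranks(n):
    """Bernard's approximation to the median ranks of n ordered failures."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def plot_correlation(times, gamma):
    """Straightness of the Weibull plot after subtracting the offset gamma."""
    ts = sorted(times)
    xs = [math.log(t - gamma) for t in ts]
    ys = [math.log(-math.log(1 - f)) for f in median_ranks(len(ts))]
    return correlation(xs, ys)

def estimate_gamma(times, step=1.0):
    """Grid search over 0 <= gamma < min(times) for the best correlation."""
    best_gamma, best_r = 0.0, plot_correlation(times, 0.0)
    gamma = step
    while gamma < min(times):
        r = plot_correlation(times, gamma)
        if r > best_r:
            best_gamma, best_r = gamma, r
        gamma += step
    return best_gamma, best_r

failure_times = [150, 190, 220, 275, 300, 350, 425, 475]  # hours, from the text
gamma_hat, r_hat = estimate_gamma(failure_times)
print(f"estimated offset: {gamma_hat:.0f} hours (correlation {r_hat:.4f})")
```

The search is restricted to offsets at or above zero, so it cannot return a negative offset of the kind discussed in Assignment 1.3.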

Assignment 1.3 

When the offset is calculated it is seen to be negative at -185.59 (say -180). This indicates that
the distribution starts before zero on the horizontal axis. This is the phenomenon of shelf life.


Some items have failed before being put into service. This can apply in practice to rubber 
components and paints, for example. 




Assignment 1.4: The Choice between Two Designs of Spring 


DESIGN A                         DESIGN B
Number   Cycles to Failure       Number   Cycles to Failure
1        726044                  1        529082
2        615432                  2        729000
3        807863                  3        650000
4        755000                  4        445834
5        508000                  5        343280
6        848953                  6        959900
7        384558                  7        730049
8        666600                  8        973224
9        555201                  9        258006
10       483337                  10       730008


Using the WEIBULL-DR software for DESIGN A above we get

β = 4

Correlation = 0.9943

F400k = 8% (cumulative failure probability at 400 000 cycles, measured from the graph in the
Weibull printout below, Fig 1.4.1 Set A)
Hence R400k = 92%


For DESIGN B we get from the WEIBULL-DR software (not shown here)

β = 2

Correlation = 0.9867

F400k = 20%

Hence R400k = 80%


Hence DESIGN A is better.
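
The comparison can be cross-checked without the software by fitting a two-parameter Weibull distribution to each data set. The sketch below uses median rank regression with Bernard's approximation. This method is an assumption and is not necessarily what WEIBULL-DR does internally, so the fitted values should only be expected to agree roughly with the rounded figures above. The function names are illustrative.

```python
import math

def fit_weibull(times):
    """Least-squares fit of ln(-ln(1-F)) against ln(t); returns (beta, eta)."""
    ts = sorted(times)
    n = len(ts)
    xs = [math.log(t) for t in ts]
    # Bernard's median rank approximation for the i-th ordered failure.
    ys = [math.log(-math.log(1 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    beta = sxy / sxx                  # slope of the Weibull plot
    eta = math.exp(mx - my / beta)    # characteristic life from the intercept
    return beta, eta

def unreliability(t, beta, eta):
    """Cumulative probability of failure F(t) for a two-parameter Weibull."""
    return 1 - math.exp(-((t / eta) ** beta))

design_a = [726044, 615432, 807863, 755000, 508000,
            848953, 384558, 666600, 555201, 483337]
design_b = [529082, 729000, 650000, 445834, 343280,
            959900, 730049, 973224, 258006, 730008]

beta_a, eta_a = fit_weibull(design_a)
beta_b, eta_b = fit_weibull(design_b)
f_a = unreliability(400_000, beta_a, eta_a)
f_b = unreliability(400_000, beta_b, eta_b)
print(f"Design A: beta = {beta_a:.2f}, F(400k) = {f_a:.1%}")
print(f"Design B: beta = {beta_b:.2f}, F(400k) = {f_b:.1%}")
```

A lower F(400k) means fewer springs are expected to fail by 400 000 cycles, which is the basis for preferring one design over the other.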


From Fig 1.4.1 Set A we can read in the table that for F = 1% at 90% confidence, the R value
is 126922 cycles. For an average use of 8000 cycles per year we get 126922/8000 = 15.86 years.
A conservative guarantee would therefore be 15 years.

NOTE: The above calculations ignore the γ value. If this is calculated, the following figures
emerge as shown in Fig 1.4.2 (the obscuration of some of the figures is the way the current
version of the software prints out).


DESIGN A

β = 3

γ = 101 828.6, say 100 000

For F = 1% at 90% confidence, the corresponding value is 176149 cycles.
Dividing by 8000 we get 176149/8000 ≈ 22 years.

[Figure: WEIBULL-DR Weibull distribution summary report. 50% confidence is the most probable
value and is represented by the red line in the graphic.]


Fig 1.4.1 Set A 


A figure of 22 years, or even 15 years, is very long indeed for any guarantee. Company policy
would have to be invoked, since there are matters to consider in determining a guarantee
other than the test data provided. These could include corrosion, user abuse, etc. Such
factors are more likely to occur the longer the operating period. Questions need to be asked,
such as: is there an industry standard for such guarantees, and what are competitors offering
as guarantees?


A further point to note is that DESIGN B exhibits very peculiar characteristics if the γ value is
taken into account. The β value remains at 2 but the γ value is negative, with a magnitude of
over 50 000 cycles! This implies that there is a probability of failure before entering service.
This data looks suspect and further tests should be done to confirm the reliability
characteristics of DESIGN B.

[Figure: WEIBULL-DR Weibull distribution summary report; the plotted line is shown in red in
the graphic.]


Fig 1.4.2 Set A with γ Calculation


ASSIGNMENT 1.5: Rolling Element Bearings 


This assignment requires a full report as detailed in the text of the book and is therefore outside 
the scope of this Instructor’s Manual.


ASSIGNMENT 1.6:
Case 1.1: Weibull Analysis at the New Era Fertilizer Plant
Here we are dealing with only three failures, but many suspensions.


The data table appears as shown below: 


Time to Failure | Failure or Suspension | Number of F or S | Rank Order | Mean Rank = F%
250 hours       | F                     | 1                | 1          | 1/25 = 4%
250 hours       | S                     | 7                |            |
600 hours       | F                     | 1                | 2.41       | 2.41/25 = 9.6%
600 hours       | S                     | 7                |            |
2000 hours      | F                     | 1                | 4.92       | 4.92/25 = 19.7%
2000 hours      | S                     | 7                |            |


When using the software the β value comes out as zero! Graphical methods indicate a value of
about 0.7, certainly less than 1. If the mean ranks calculated above are plotted as the F values
on Weibull graph paper, against the three failure times, the β value comes out as 0.7. Any
value between 0 and 1 indicates a hyper-exponential distribution, i.e. a quality problem. We do
not have enough information to be sure of this, as we have only three data points. Even so, the
correlation is very high: the three points lie almost exactly on a straight line.
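
The rank orders in the table can be reproduced with the usual adjusted-rank rule for data containing suspensions: each new failure advances the rank by (N + 1 - previous rank) / (1 + number of units still at risk), with N = 24 units here. The sketch below applies that rule and then fits a straight line through the three plotted points. The function name and the use of a least-squares slope are assumptions. A fitted slope will generally differ somewhat from a value read by eye off graph paper, but it should also fall below 1.

```python
import math

def adjusted_ranks(events):
    """events: (time, 'F' or 'S') in order of occurrence.
    Returns (time, rank order, mean rank) for each failure."""
    n = len(events)
    rank = 0.0
    results = []
    for index, (time, kind) in enumerate(events):
        if kind != "F":
            continue
        still_at_risk = n - index          # this unit plus all later ones
        rank += (n + 1 - rank) / (1 + still_at_risk)
        results.append((time, rank, rank / (n + 1)))
    return results

# One failure followed by seven suspensions at each of the three times.
events = []
for hours in (250, 600, 2000):
    events.append((hours, "F"))
    events.extend([(hours, "S")] * 7)

table = adjusted_ranks(events)
for time, rank, f in table:
    print(f"{time:>5} hours  rank {rank:.2f}  F = {f:.1%}")

# Slope of ln(-ln(1 - F)) against ln(t) as a numerical estimate of beta.
xs = [math.log(t) for t, _, _ in table]
ys = [math.log(-math.log(1 - f)) for _, _, f in table]
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs))
print(f"fitted beta = {beta:.2f}")
```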


Since values of β less than unity indicate a quality problem, then either the manufacturer is
selling poor quality bellows or they are being damaged on installation.


Even without using Weibull analysis we can see from the failure pattern that something is
wrong here. One set of bellows lasted 2000 hours before the first failure. Perhaps the seven
bellows discarded at 250 hours might also have lasted at least 2000 hours. The same goes for
the ones discarded at 600 hours.


Failure analysis like this is like detective work. We pick up clues and follow where they lead.


This means we can now formulate a plan of action: 

1. When the next failure occurs, replace only the failed item. In this way we can build up a set
of eight or so failures: five to add to the three that we have already.

2. When the next failure occurs, we must observe the installation to see if the bellows is
being damaged when fitted.

3. We must visit the manufacturer’s works and study their production and quality control.

4. We should also study past records from the company’s SCADA¹ system to see whether the
bellows have been subjected to out-of-specification conditions, especially high
temperatures or high pressure. Reliability only applies to defined operating conditions.

5. Check on the storage of the bellows. Rubber components can exhibit shelf life
problems, perhaps leading to premature failures after installation. How long are the
bellows stored and under what conditions? Rubber items should be stored in a dark,
cool room.


¹ SCADA: Supervisory Control and Data Acquisition

The object of this assignment is to demonstrate that we must use whatever data we have in the
interests of solving a problem. We will never have perfect data — the cost of perfect information 
is infinite. As this example demonstrates, the little bit of data that we have at least allows us to 
formulate a plan of investigation to establish the true situation. 


ASSIGNMENT 1.7: 


Case 1.2: The Life History of a Hillman Vogue Sedan 


1. The question is to find the following in the repair record:
• Infant mortality failure
• Incomplete repair
• Life extension
• Indication of failure
• Retrofit
• Visual inspection
• Preventive measure, or proxy replacement, i.e. replacing something so that
  something else does not fail
• Root cause analysis

The answers are given alongside each repair action in the table below:


Kilometres | Repair Action | Answer
42 300 | New clutch | This could well be an infant mortality failure, as the clutch had a very short life
89 500 | Retrofit inline fuel filter | Retrofit, as stated
140 500 | New head gasket, valve grind | As regards the engine as a whole, this was an incomplete repair, as the head gasket failed again at 170 000 km
140 900 | New clutch plate, engine rebuild | Life extension
170 000 | Blown head gasket; car sold as scrap |


2. The main issue in this case is that a bathtub curve is present (Figure 1.17 in the case: Major
Repair Cost vs Years of Service). But this is not the failure rate vs time bathtub of the
literature. It is a cost bathtub. Many systems are withdrawn from service when the costs
to maintain the system become too high, not because the failure rate is increasing.


3. The difference between the two models was that the 1976 model was a “parts bin special”:
cars were being put together with what remained of component production after manufacture
of certain items had all but ceased. Quality also sometimes tails off at the end of a production
run as workers and management lose interest, and as special “one-off” parts might have to be
made.


CHAPTER 2
FMECA of a Scraper Winch 


The FMECA for this simple piece of equipment is many pages long! This is typical for this 
type of study. Only a sample of the tabulation is given here. Definitions for Severity and 
Probability of Occurrence are given below. Recommendations proceeding from the FMECA 
are given after the tabulation. 


Effect Severity 


For Effect Severity, a scale of 1 to 5 was used. The Effect Severity was rated on how the 
specific failure will influence the main purpose of the winch, being drum rotation to wind up 
the scraper rope in order to pull the scraper. 


1 — Low Probability for the drums to not be able to rotate after the failure has occurred. 


2 — Medium to Low Probability for the drums to not be able to rotate after the failure has 
occurred. 


3 — Medium Probability for the drums to not be able to rotate after the failure has occurred. 


4 — Medium to High Probability for the drums to not be able to rotate after the failure has 
occurred. 


5 — High Probability for the drums to not be able to rotate after the failure has occurred. 


Occurrence Probability

For Occurrence Probability, a scale of 1 to 5 was also used. 

1 — Low Probability of the failure occurring. 

2 — Medium to Low Probability of the failure occurring. 

3 — Medium Probability of the failure occurring. 

4 — Medium to High Probability of the failure occurring. 


5 — High Probability of the failure occurring. 


Recommendations from the FMECA 
Design features to improve reliability as identified by following the FMECA process include: 
1. To minimise gearbox damage, the gearbox is a sealed unit. 


2. To minimise the probability of the motor pinion coming loose, the motor shaft is tapered and 
so is the pinion bore and key. There is also a lock washer and lock nut to secure the motor 
pinion. 


3. To withstand greater loads and minimise bearing damage, duplex bearings and oil seals are 
fitted for the clutch gear bearings and the main shaft bearings. 


4. The pedestal bearing is easily accessible for replacement.


5. Between the drums a curved flat bar section is provided to prevent the rope from coiling
between the drums.


6. Modern winch motors are purposely designed and built to operate at large slip values.


7. Pressed sleeves are fitted to the shafts to locate the gears and bearings. 


8. All interference fit components are factory pressed with a 100 ton press. 


9. The modern scraper winch is of a very robust design in order to survive underground 
transport and operations. 


Item | Function | Failure | Effects | Severity of the Effect | Cause | Probability of Occurrence | Control Action
Motor | Power source | Insulation breakdown | Motor failure | 3 | Motor cannot stand load variation | 3 | Design motor to operate at large slip values
Motor | Power source | Water ingress | Motor failure | 5 | Phase to phase or phase to earth | 3 | Improved sealing
Motor Shaft | Torque transmission | Shaft failure | Winch will not operate | 5 | Design torque exceeded | 1 | Redesign for adequate strength
Motor Pinion | Torque transmission | Pinion comes loose on shaft | Damage to gears and shaft | 5 | Incorrect fit on shaft | 1 | Redesign with tapered shaft, double keyway, locknut and lockwasher
Motor Pinion | Torque transmission | Tooth damage | Degradation failure | 5 | Lack of lubrication | 1 | Change to sealed for life gearbox with synthetic oil
Motor Pinion | Torque transmission | Tooth damage | Degradation failure | 5 | Lubrication contamination | 1 | Change to sealed for life gearbox with synthetic oil


Table 2.1 Sample of the first page of an FMECA 
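
The manual does not state how the Severity and Probability of Occurrence ratings are combined into a criticality ranking. One common convention, assumed here purely for illustration, is to multiply the two ratings and sort the failure modes by the product. The sketch below applies that to the six rows of Table 2.1; the tuple layout and function name are hypothetical.

```python
# (item, failure mode, severity 1-5, occurrence 1-5), copied from Table 2.1
rows = [
    ("Motor", "Insulation breakdown", 3, 3),
    ("Motor", "Water ingress", 5, 3),
    ("Motor Shaft", "Shaft failure", 5, 1),
    ("Motor Pinion", "Pinion comes loose on shaft", 5, 1),
    ("Motor Pinion", "Tooth damage (lack of lubrication)", 5, 1),
    ("Motor Pinion", "Tooth damage (lubrication contamination)", 5, 1),
]

def criticality(severity, occurrence):
    """Assumed criticality number: severity rating times occurrence rating."""
    return severity * occurrence

# Highest criticality first; sorted() is stable, so ties keep table order.
ranked = sorted(rows, key=lambda r: criticality(r[2], r[3]), reverse=True)

for item, failure, severity, occurrence in ranked:
    score = criticality(severity, occurrence)
    print(f"{score:>2}  {item}: {failure}")
```

On this convention the failure modes with both a high severity and a non-trivial occurrence rating rise to the top of the list, which is where control actions would be considered first.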


Fault Tree Assignment 


The required fault tree is given below: 

Basement Flooded (OR gate)
    A: Rate of Inflow Exceeds Pump System Capacity
    L: Failure of Pump System (AND gate)
        B: Inflow to Basement
        K: Primary Pump Failure (OR gate)
            C: Power Outage
            D: Primary Pump Failure
        M: Back-up Pump Failure (OR gate)
            H: Back-up Pump Malfunction
            J: Battery Drained (AND gate)
                E: Failure of Superintendent to Take Action
                F: Period of Power Outage Exceeds Battery Capacity
                G: Period of Inflow Exceeds Battery Capacity


The Probabilities 


The probabilities of the various events are given in the table below, as repeated from the book: 

Event Description Probability


Back-up pump failure 0.05 


P(J) = Probability of Battery Drainage = P(E) × P(F) × P(G)
     = 0.2 × 0.05 × 0.5
     = 0.005


P(M) = Probability of Back-up Pump Failure = P(J) + P(H) - P(J) × P(H)
     = 0.005 + 0.05 - 0.005 × 0.05
     ≈ 0.055


P(K) = Probability of Primary Pump Failure = P(C) + P(D) - P(C) × P(D)
     = 0.1 + 0.1 - 0.1 × 0.1
     = 0.19


P(L) = Probability of Failure of the Entire Pump System = P(B) × P(K) × P(M)
     = 0.95 × 0.19 × 0.055
     = 0.0099


Therefore the probability of the basement flooding is:
P(A) + P(L) - P(A) × P(L)
     = 0.05 + 0.0099 - 0.05 × 0.0099
     = 0.0599 - 0.00049
     = 0.059, i.e. ≈ 0.06


This is a yearly probability. In other words, the basement is predicted to flood about 6 times in
a hundred years. Because all the mathematics is based on random failure, such a flood could
occur at any time, even next year. On average, however, floods should occur about once every
17 years (1/0.059 ≈ 17).
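
The gate arithmetic above can be checked with a short script. The sketch below assumes all basic events are independent, as the hand calculation does, and takes the event probabilities from the figures used in the calculation; the helper names p_and and p_or are illustrative.

```python
def p_and(*probabilities):
    """AND gate: all independent input events must occur."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result

def p_or(*probabilities):
    """OR gate: at least one of the independent input events occurs."""
    none_occur = 1.0
    for p in probabilities:
        none_occur *= (1.0 - p)
    return 1.0 - none_occur

# Basic event probabilities as used in the worked calculation.
p = {"A": 0.05, "B": 0.95, "C": 0.1, "D": 0.1,
     "E": 0.2, "F": 0.05, "G": 0.5, "H": 0.05}

p_j = p_and(p["E"], p["F"], p["G"])   # battery drained
p_m = p_or(p_j, p["H"])               # back-up pump failure
p_k = p_or(p["C"], p["D"])            # primary pump failure
p_l = p_and(p["B"], p_k, p_m)         # failure of the pump system
p_flood = p_or(p["A"], p_l)           # basement flooded

print(f"P(J) = {p_j:.4f}")
print(f"P(M) = {p_m:.4f}")
print(f"P(K) = {p_k:.4f}")
print(f"P(L) = {p_l:.4f}")
print(f"P(flood) = {p_flood:.4f}, about one flood every {1 / p_flood:.0f} years")
```

For two inputs, the OR gate formula 1 - (1 - p1)(1 - p2) expands to p1 + p2 - p1 × p2, which is the form used in the hand calculation. Small differences in the last digit arise because the hand calculation rounds intermediate results.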


Notice also that the inflow into the sump (event B) must appear in the diagram. If the pump
system fails but there is no inflow, there is no flood. Hence P(B) and its complement, P(A),
must appear in the fault tree.